MTI for Full Text

Abstract

To provide a stable base for the experiments with full text, the MTI indexing paths were run separately on each of the sections of the full text test collection. The output from each indexing path was saved and subsequently used by MTI for all of the experimental processing. The evaluation in the phase 1 experiments reported in the AMIA paper was based on the human indexing extracted from MEDLINE in December 2003. Because new work planned for phase 2 would change the way the text of the articles was separated into sections, new processing by the indexing paths in MTI was necessary. Since this processing uses the current PubMed database, the processing and evaluation must be done in the current environment. Consequently, moving our experiments to the new environment required establishing new baselines. In addition, the baseline is based on the current production version of MTI, so we prepared an updated version of the experimental, section-handling version of MTI to allow a valid comparison of their results. Finally, to support a current evaluation, we extracted from MEDLINE the indexing for the test collection articles to serve as the gold standard. In that environment, new performance baselines were established using the current production version of MTI and the updated, experimental version of MTI.
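The comparison described above, scoring two MTI versions against the same gold-standard indexing, can be sketched as follows. This is a minimal illustration only: the abstract does not state which metrics were used, so precision, recall, and F1 are assumptions here, and the function name `score` and the sample headings are hypothetical.

```python
def score(recommended, gold):
    """Return (precision, recall, f1) for one article's heading sets.

    Assumption: the abstract does not name its metrics; precision,
    recall, and F1 are used here purely for illustration.
    """
    recommended, gold = set(recommended), set(gold)
    hits = len(recommended & gold)  # headings both sources agree on
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical data: gold-standard indexing for one article and the
# recommendations from two versions of the indexer.
gold = {"Humans", "MEDLINE", "Abstracting and Indexing"}
production = {"Humans", "MEDLINE", "Algorithms"}
experimental = {"Humans", "MEDLINE", "Abstracting and Indexing", "Algorithms"}

baseline_production = score(production, gold)
baseline_experimental = score(experimental, gold)
```

Scoring both versions against the same gold standard, extracted at the same time, is what makes the two baselines comparable.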

Similar resources

12 years on – Is the NLM medical text indexer still useful and relevant?

BACKGROUND Facing a growing workload and dwindling resources, the US National Library of Medicine (NLM) created the Indexing Initiative project in 1996. This cross-library team's mission is to explore indexing methodologies for ensuring quality and currency of NLM document collections. The NLM Medical Text Indexer (MTI) is the main product of this project and has been providing automated indexi...

Extracting Characteristics of the Study Subjects from Full-Text Articles

Characteristics of the subjects of biomedical research are important in determining if a publication describing the research is relevant to a search. To facilitate finding relevant publications, MEDLINE citations provide Medical Subject Headings that describe the subjects' characteristics, such as their species, gender, and age. We seek to improve the recommendation of these headings by the Med...

Semi-Automatic Indexing of Full Text Biomedical Articles

The main application of U.S. National Library of Medicine's Medical Text Indexer (MTI) is to provide indexing recommendations to the Library's indexing staff. The current input to MTI consists of the titles and abstracts of articles to be indexed. This study reports on an extension of MTI to the full text of articles appearing in online medical journals that are indexed for Medline. Using a col...

Identification of Important Text in Full Text Articles Using Summarization

Other research has shown that although the abstract is more information dense, the full text of a scientific article in the biomedical domain has much greater information content.1 We know from observing indexers and studying their indexing process that some of the assigned MeSH concepts do not appear in the abstract. The indexing manual also dictates that the abstract should not be used during...

Semi automatic indexing of PostScript files using Medical Text Indexer in medical education.

At Albert Einstein College of Medicine a large part of online lecture materials contain PostScript files. As the collection grows it becomes essential to create a digital library to have easy access to relevant sections of the lecture material that is full-text indexed; to create this index it is necessary to extract all the text from the document files that constitute the originals of the lect...


Publication date: 2005